Lip animation synthesis: a unified framework for speaking and laughing virtual agent

Authors

  • Yu Ding
  • Catherine Pelachaud
Abstract

This paper proposes a unified statistical framework to synthesize speaking and laughing lip animations for virtual agents in real time. The lip animation synthesis model takes as input the decomposition of a spoken text into phonemes together with their durations, and can also be used with synthesized speech. First, Gaussian mixture models (GMMs), called lip shape GMMs, are used to model the relationship between phoneme duration and lip shape from human motion capture data; then an interpolation function based on hidden Markov models (HMMs), called HMM interpolation, is learnt from the same motion capture data. In the synthesis step, the lip shape GMMs infer a first lip shape stream from the inputs; this stream is then smoothed by the learnt HMM interpolation to obtain the final lip animation. The effectiveness of the proposed framework is confirmed in an objective evaluation.
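The two-step pipeline described in the abstract can be sketched in simplified form: map each (phoneme, duration) pair to a per-frame lip-shape stream, then smooth that stream. This is a minimal illustrative sketch, not the paper's method: the per-phoneme mean shapes and the moving-average smoother below are hypothetical stand-ins for the lip shape GMMs and the learnt HMM interpolation, respectively.

```python
import numpy as np

# Hypothetical per-phoneme lip-shape parameters (e.g. lip opening, lip width),
# standing in for the means of the lip shape GMMs learnt from motion capture.
LIP_SHAPE_MEANS = {
    "a": np.array([0.9, 0.5]),   # open mouth
    "m": np.array([0.0, 0.4]),   # closed lips
    "o": np.array([0.6, 0.2]),   # rounded lips
}

def infer_lip_stream(phonemes, durations, fps=25):
    """Step 1: expand (phoneme, duration in seconds) pairs into a raw
    per-frame lip-shape stream by repeating each phoneme's mean shape."""
    frames = []
    for ph, dur in zip(phonemes, durations):
        n_frames = max(1, int(round(dur * fps)))
        frames.extend([LIP_SHAPE_MEANS[ph]] * n_frames)
    return np.array(frames)

def smooth_stream(stream, window=5):
    """Step 2: smooth the raw stream across frames; a simple moving
    average stands in for the learnt HMM-based interpolation."""
    kernel = np.ones(window) / window
    return np.column_stack(
        [np.convolve(stream[:, d], kernel, mode="same")
         for d in range(stream.shape[1])]
    )

raw = infer_lip_stream(["m", "a", "o"], [0.12, 0.20, 0.16])
anim = smooth_stream(raw)   # smoothed lip animation, one row per frame
```

The smoothing step is what removes the abrupt jumps at phoneme boundaries that the raw per-phoneme stream would otherwise exhibit.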


Similar resources

Method for Custom Facial Animation and Lip-Sync in an Unsupported Environment, Second Life™

The virtual world of Second Life™ does not offer support for complex facial animations, such as those needed for an intelligent virtual agent to lip sync to audio clips. However, it is possible to access a limited range of default facial animations through the native scripting language, LSL. Our solution to produce lip sync in this environment is to rapidly trigger and stop these default anima...


Anthropomorphic Agent as an Integrating Platform of Audio-Visual Information

One of the ultimate human-machine interfaces is an anthropomorphic spoken dialog agent, which behaves like a human with facial animation and gestures and holds speech conversations with humans. Among the numerous efforts devoted to this goal, the Galatea Project, conducted by 17 members from 12 universities, is developing an open-source, license-free software toolkit [1] for building an anthropomorphic spoken dia...


MLSLib: A Lip Sync Library for Multi Agents and Languages

This article presents MLSLib, a software library for human figure animation with lip syncing. The library enables us to easily use multiple TTS systems and multiple lip motion generators, and switch them arbitrarily. It also supports multiple speaking agents, possibly with different TTS systems and lip motion generators. The MLSLib is composed of three modules: LSSAgent, TTSManager, and FCPMan...


Speaking with Emotions

We aim at the realization of an Embodied Conversational Agent able to interact naturally and emotionally with users. In previous work [23], we elaborated a model that computes the nonverbal behaviors associated with a given set of communicative functions. Specifying, for a given emotion, its corresponding facial expression will not by itself produce the sensation of expressivity. To do so, one needs ...


Upper Body Animation Synthesis for a Laughing Character

Laughter is an important social signal in human communication. This paper proposes a statistical framework for generating laughter upper body animations. These animations are driven by two types of input signals, namely the acoustic segmentation of laughter as a pseudo-phoneme sequence and acoustic features. During the training step, our statistical framework learns the relationship between the la...



Journal title:

Volume   Issue

Pages  -

Publication date: 2015